All-electron density functional theory and time-dependent density functional theory 

with high-order finite elements 
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We present for static density functional theory and time-dependent density functional theory cal- 
culations an all-electron method which employs high-order hierarchical finite-element bases. Our 
mesh generation scheme, in which structured atomic meshes are merged to an unstructured molec- 
ular mesh, allows a highly nonuniform discretization of the space. Thus it is possible to represent 
the core and valence states using the same discretization scheme, i.e., no pseudopotentials or similar- 
treatments are required. The nonuniform discretization also allows the use of large simulation cells, 
and therefore avoids any boundary effects. 
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I. INTRODUCTION 

The density functional theory (DFT) has become the 
workhorse in electronic structure theory^. Its success 
derives from the ability to produce accurate results with 
reasonable computational effort. Instead of solving for 
the many-body wavefunction it relies on Hohenberg- 
Kohn theorem 21 which states that all ground-state prop- 
erties - most importantly the total energy - are func- 
tional of the electron density. Actually, the total energy 
functional is not exactly know, but there exists several 
approximations, the accuracy of which can be system- 
atically improved according to the demands of the ap- 
plications in questio n 16 ! 31 . The most important issue is 
that the number of dimensions in a problem does not de- 
pend on the number of electrons, and thus DFT scales 
much better than many-body wavefunction methods, up 
to hundreds or thousands of atoms on the present super- 
computers. 

The DFT is bound to the ground-state properties and 
cannot be used to explore excited states. This draw- 
back can be overcome by using the time-dependent DFT 
(TDDFT)! 3 .. It is based on the Runge-Gross theorem^ 6 ., 
which states that (physically) different external poten- 
tials {e.g., those due to laser fields) lead to different 
time-evolutions of the density. The present functionals 
for TDDFT are known to be unable to describe certain 
phenomena, such as charge transfer excitations. How- 
ever, in recent years it has been successfully applied to 
description several other problems, for example, the opti- 
cal absorption spectra of a broad variety of systems, the 
nonlinear optical response {e.g., harmonic generation) of 
atoms and molecules, and coherent control of molecules 
by laser fields^ 3 -. 

For numerical solution, the partial differential equa- 
tions arising from DFT and TDDFT must be discretized 
in space. In the present-day codes, the most popular 
choices are atomic orbital base o 6 ' 18 i 41 , planewave a 20 ' 26 , 
and uniform real-space grid o 28 i 29 . In the atomic orbital 
bases the solution is represented as a linear combination 
of atomic solutions, which can be accurate {e.g., numeri- 



cal atomic orbitals 6 ) or approximate {e.g., Gaussiansi 8 -) . 
These bases are widely used and can be very fast and 
efficient. However, the atomic orbital bases are sensi- 
tive to the type of the problem in the sense that an effi- 
cient discretization for the ground state properties is not 
well suited for the calculation of optical absorption spec- 
tra. In particular, when the solution is not representable 
as slightly perturbed atomic solutions the atomic orbital 
bases become unfavourable. For example, this can hap- 
pen in the case of nonlinear time-dependent phenomena. 

The planewave bases and uniform real-space grids {i.e., 
the finite-difference method) are both uniform discretiza- 
tions of the space and closely related to each other 
through the Fourier transform. These discretizations are 
not dependent on the type of the problem, but they re- 
quire a large number of degrees of freedom. Especially, 
the core regions around nuclei, where solutions have very 
sharp features, cannot be represented well by uniform dis- 
cretization, but pseudopotential a 15 ! 34 ' 43 or similar treat- 
ments {e.g., projector-augmented wave method 5 ) must 
be employed. The pseudopotentials lead to additional 
parameters and may be hard to construct accurately for 
certain types of atoms, e.g., transition metals. Another 
drawback in uniform discretizations is their inability to 
adapt to the underlying geometry of the atoms. For ex- 
ample, in sparse matter interstitial regions should require 
much less degrees of freedom than regions near atoms. 
This is also the case in simulations of nonlinear time- 
dependent phenomena, where the distant regions in space 
should still be accounted for but the solution is smooth 
in this region so that the discretization can be coarse. 

The finite element basi o 42 ' 45 is a linear combination 
of continuous, piece-wise polynomials and provides a 
nonuniform real-space discretization of the space. It 
inherits the good properties of the real-space methods, 
such as, flexible boundary conditions and efficient paral- 
lelization via domain decomposition, while still allowing 
nonuniform discretization of the space. In this paper, 
we use high-order hierarchical finite elements, which i) 
provide a better rate of convergence than low-order ele- 
ments, and ii) result in better conditioned systems of lin- 
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ear equations than the nodal-based elements of the same 
order. As finite elements can adapt to the local feature 
size, they can be used to describe solutions of core and 
valence electrons equally well. And naturally, they are 
adaptable to the geometry of the system to avoid overdis- 
cretisation. The finite element basis is also variational 
like planewaves and atomic bases which is not the case 
for finite-difference discretizations. The finite element 
basis is extensively used in civil and mechanical engineer- 
ing, and in many fields it has surpassed finite-difference 
methods 3 -. There exists several earlier implementations 
of the finite-element methodology to electronic structure 
calculation a 1 ' 2 ' 14 ! 27 ' 32 ! 44 ' 46 ! 48 ' 49 . However, none of these 
uses high-order hierarchical elements on unstructured 
meshes or apply the method to TDDFT. The closest 
work to our approach is the spectral element method im- 
plementation of Batcho^. The spectral element method 
uses high-order tensor product bases, which enable fast 
evaluation of matrix elements and provide good conver- 
gence rates. However, the element types are restricted 
to brick (i.e., parallelepiped) elements and mapped brick 
elements (i.e., coordinate transformations of brick ele- 
ments) . 

The rest of the paper is structured as follows. In the 
next section, we briefly review the DFT, linear response 
TDDFT, and finite-element method. We also describe 
our mesh generation algorithm. In the section IIII1 we 
show several examples of applying our DFT and linear- 
response TDDFT method to small molecules (CO, Na2, 
CgHg) and discuss the convergence of the method. In the 
final section, we draw the conclusions and set directions 
for future research. 



II. THEORY 

A. Density functional theory 

In the density functional theory, the total energy 
I£[n(r)] is a functional of the electron density n(r), and 
the ground state of the system is found by minimizing it. 
However, the functional is not known in general and must 
be approximated. This is usually done by employing the 
Kohn-Sham™ scheme where the functional is divided into 
four parts: 

E[n] = T s [n] + J d 3 rn{r)v ext (r) + U[n] + E xc [n], (1) 

where T s [n] is kinetic energy of the non-interacting elec- 
tron system with density n(r), J d 3 rn(r)w ea;t (r) is the 
interaction energy with an external field (usually, that 
due to the ions), U[n] is the mean electron-electron re- 
pulsion energy (Hartree energy), and E xc [n] is the elec- 
tron exchange-correlation energy functional. The three 
first parts are known but the last one, the exchange- 
correlation functional, is not, and the quality of its ap- 
proximation is the key to accurate results. The Kohn- 



Sham scheme uses a set of orthonormal auxiliary func- 
tions , 0fc( r )j *- e -i th e Kohn-Sham orbitals, which satisfy 

^states 

n(r)= MMr)\ 2 , (2) 

k=i 

where are the occupation numbers, and N states is the 
number of occupied Kohn-Sham orbitals. By taking the 
functional derivative of the energy functional with re- 
spect to these functions, we obtain the Kohn-Sham equa- 
tions: 

H K s^k{v) = (~7^- v2 + v eff{r)^j ipk(r) = e*V*(r), 

(3) 

where 

v e ff (r) = vh [n] (r) + v xc [n] (r) + v ex t (r) (4) 
is the effective potential, and 

'-MM-j^/A^ (5, 

is the Hartree potential. Furthermore, i> X cM( r ) ' s the 
exchange-correlation potential, and v ex t(r) is the exter- 
nal potential, which is usually a sum of electron-nucleus 
interactions, i.e., 

_ e 2 Nuclei ^ 

Vext(r) = V — (6) 

4?r£ o j^i l r ~ r «l 

where Z a is the atomic number and r a is the position 
of the nucleus a. N nuc i e i is the number of nuclei in the 
system. In the three dimensional space ]R 3 , the Hartree 
potential can be rewritten as the solution of the Poisson 
equation 

V 2 v H (r) = -4nr-^—n(r), (7) 

47T£ 

where the boundary condition for isolated systems is 
vh when |r| — * oo. (Also periodic and other bound- 
ary conditions are possible but are not discussed in this 
paper.) 

As the Hartree potential, the density and thus the 
Kohn-Sham wavefunctions vanish at the infinity (or in 
practice at the boundary dfl of the computational do- 
main fi), the above Eqs. |3|) and ([7|) can be cast into the 
weak variational formulation using integration by parts, 
i.e., 

($\H KS \A) = 

l $(r) (^: v2+ ^ (r) )^ (r)d3r 

+ $(r)« e// (r)^(r))d 3 r, (8) 
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and 



($\X7 2 \v H ) = ( $(r)V 2 w ff (r)d 3 r 



V$(r) • Vv H (r)d 3 r, (9) 



where $(r) is a test function which has a square inte- 
grable gradient V$(r). The weak formulation is required 
by the finite element method, and in practice, <E>(r) will 
be a finite element basis function (in the so-called Ritz- 
Galerkin method^, see Eq. ([21]) ). 

As the Hartree potential for charged systems decays 
slowly as r -1 , we have applied counter charges to neu- 
tralize the density. The counter charges are added to the 
electronic density n(r) in Eq. © and are then cancelled 
in Eq. ([5]) by the corresponding analytically calculated 
potential. This provides the r~ 2 decay of the Hartree 
potential, which is sufficient for our purposes. However, 
if required, higher order (e.g., dipole and quadrupole) 
corrections can be applied as well 2 -. 



B. Linear response time-dependent DFT 

In the time-dependent DFT, there exist no variational 
principle, but the quantum mechanical action 

A[ip] = jf 1 dt(m\^§- t - H(t)\m) (10) 

provides an analogous quantity to the total energy of 
the ground-state DFT. The time-dependent Kohn-Sham 
Schrodingcr equation reads as 



2m, 



-V 2 + v eff [n](T,t) VfcM) 



This equation is an initial value problem and can be 
solved using a time-propagation scheme^. However, if 
the external perturbation is small, the density response 
of the system can be written as a series 

n(r,w) = n (0) (r)+n (1) (r,w)+n (2) (r,w) + ..., (12) 

with the linear response term 

n( 1] (r,u)= f dV'x(r,r»/'(r». (13) 



Above, x is the linear response function and «W is the 
external perturbation (e. g. a laser field). The transitions 
can be found by finding the poles of the response function 
x(r, r',w). However, if we arc interested only in the ex- 
citation energies and corresponding oscillator strengths, 
we can use the so-called Casida method. He showed that 
the problem can be solved as an eigenvalue equation&lil 



j'k' 



where f k j = fk — fj, e jk = £j - e fc , and the coupling 
matrix 



d'r I ^rV*(r)V fc (r)Vy(r')^(r') 
e 2 1 



- +f xc (r,T',u>) 



Moreover, 



47T£n r — r' 



, / / / \ Sv xc {r,uj) 
f xc (ru, rw) = - r — l — jr 



(15) 



(16) 



is the exchange-correlation kernel. The oscillator 
strengths arc then 



x/y/z ~ fi2 e 2 



fk > fj 

J2 ^)x/ y /z\l (fk - fj)(tj - efe)7]r } 

jk 



(17) 

where (njk) x /y/z is the x/y/z component of the dipole 
moment vector between the Kohn-Sham states k and j, 
and the index (m) refers to the m th transition. 



1. Confinement potential 

The linear response Kohn-Sham equations use the 
Kohn-Sham states as a basis. Above the ionization limit 
of the system, the spectrum becomes continuous causing 
numerical problems. The eigenvalues of the discretized 
problem bunch together just above the ionization limit. 
For a practical calculation this is not desirable because 
certain transitions have very many different contribu- 
tions due to the eigenstates in the Kohn-Sham contin- 
uum and the importance of most of them is minor be- 
cause the states have a relatively small amplitude near 
the molecule. 

To spread the eigenvalue spectrum above the ionization 
limit, and to increase the relative importance of the rel- 
evant unoccupied states, we use a modified Kohn-Sham 
basis {?/>/c(r)}. The basis is constructed by applying an 
auxiliary confinement potential in the ground-state cal- 
culation. The choice of the potential is in principle ar- 
bitrary, but in order to fill the above requirements, we 
have chosen the form 



2 l^mn 



(r) - R c \ 2 , if r min (v) > R c 



x lyw = fl 2 j jk , (14) 



0, otherwise, 

(18) 

where r m j n (r) = min/j a \r — R a \ is the distance to the 
closest atom, and /c c and R c are parameters to be chosen. 
Thus, the auxiliary potential is zero close to the atoms 
but becomes gradually more repulsive further away. Far 
away from the system, the auxiliary potential is a spher- 
ically symmetric harmonic potential. Now, all states are 
bound. 
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After the ground-state calculation with the auxiliary 
confinement potential the resulting Kohn-Sham states 
{ - 0fc(r)} are taken as the new basis, the auxiliary con- 
finement potential is removed, and the ground-state cal- 
culation is repeated in the new basis. Finally, the linear 
response calculation is carried out in the new basis. 

Introducing the auxiliary confinement potential allows 
us to balance between the number of unoccupied states 
and the quality of the low energy part of the spectrum. 
We want to stress out that this is purely a mathemati- 
cal trick in order to alter the basis of the linear response 
calculation in such a way that the low energy transitions 
converge more quickly. The physics is not altered. The 
calculated linear response spectrum with and without an 
auxiliary confinement potential should give the same re- 
sult when all the Kohn-Sham states (occupied and un- 
occupied) are used as they span the same original finite 
element space Vh- Also, as the confinement potential de- 
termines the linear response basis, the final result of a 
converged calculation is independent of the original basis 
where the Kohn-Sham states were solved, e.g., converged 
atomic orbital and real-space calculations should give the 
same result. 

The choice of the parameters R c and k c is not an obvi- 
ous task and some testing is required to find appropriate 
values. However, the testing can be done as a linear prob- 
lem by fixing the density, because the confinement should 
not change the ground-state. 

C. Finite-Element Discretization 

In the finite-element method the computational do- 
main ft is divided into small, polyhedral regions called 
elements. This division is denoted by Th- For our pur- 
poses it is sufficient to use tetrahedra. Other popular 
choices are hexahedra, pyramids and prisms. The divi- 
sion of fl is handled by an external mesh-generator that 
can either i) generate the mesh for a given geometry or ii) 
calculate the Delaunay tetrahedralization of a given set of 
points. We have chosen the latter option and the points 
for the mesh are generated as specified in Section III C 11 

Once the division of the domain Q is complete the 
space of approximation, Vh, can be defined. For the 
finite-element method this is taken to be continuous, 
piecewise polynomial functions, i.e. 

V h = {vh G C(fi) | {vh)\K G n p } MK e Th (19) 

where K is an element, H p denotes polynomials of order 
p, h refers to the size of the elements in the mesh, and 
C(Q) refers to continuous functions in the domain. In 
general, the order p can vary from one element to an- 
other as long as the continuity condition Vh G C(O) is 
respected but in our calculations we choose to keep p 
fixed throughout the mesh. The value of p decides if the 
method is considered to be of high-order and the usual 
requirement is p > 3 for a high-order method. Also, if 
the convergence is obtained via increasing the order of 



polynomials rather than refining the mesh the method 
is called the p-method. The mesh refinement approach 
gives an /i-method and combining these approaches leads 
to an /ip-mcthod^I. 

Next, a basis for the space Vh must be chosen. The 
canonical way for the high-order method is to divide the 
local basis functions of a single element into four disjoint 
sets: nodal functions, edge functions, face functions, and 
bubble functions. The nodal functions are first order 
polynomials that have a value one at one of the vertices 
and zero at others. The edge functions are polynomials 
up-to an order p and they are non-zero only on one of the 
edges of the element. The face functions are similar to 
the edge functions but they are in correspondence with 
the faces of the element. Finally, the bubble functions are 
zero on all the vertices, edges and faces of the element 
but non-zero inside the element. The actual basis func- 
tions are generated using products of one-dimensional in- 
tegrated Legendre polynomials over the interval [— 1 , 1] . 
Note that due to the continuity requirements the basis 
functions actually extend over several elements that share 
the same geometrical feature (see Fig. [T|). 




FIG. 1: Schematic view of finite element basis functions in 
2D: a) vertex, b) edge, and c) bubble basis functions 

In practise, the basis functions for an clement K in the 
mesh are generated using a reference element, K, and 
(affine) mappings F : K — > K . Then the basis functions 
on an element K can be written as images of the basis 
functions on the reference element, i.e., 

^(r) = ^(F- 1 (r)), (20) 

reducing the programming effort to K. 

Once the basis {4>j}j=i for the space Vh is ready for 
use an approximation to the Kohn-Sham orbitals can be 
looked for in the form ^fc(r) = Ylj=i c j<A?'( r )- There are 
many ways to find the coefficients Cj but in the finite- 
element method the variational approach is used. This 
leads to an equation for the state k 

N b N b 

^(falHKslfa)^ = e fc ^(<^ i )c^, i= 1, ...,N b , 

(21) 

that reads in matrix form as 

Hc k = e k Sc k , (22) 

where 

= ((j)i\H K s\(l)j), Sij = (<f>i\<t>j) = / 0i(r)(/>j(r)dr. 

(23) 
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A few observations are in order. First, since the finite- 
element basis functions are strictly localized in space the 
matrices H and S are sparse. This not only allows for 
but actually dictates the use of sparse matrix technolo- 
gies. Second, if the domain £1 is large enough so that 
selecting the zero boundary conditions on dQ is justified 
the variational formulation (|21| holds and consequently 
the matrix H is also symmetric. In this case the fact that 
the basis functions <f>i don't have continuous derivatives 
across the element borders is not an obstacle since in (|2Tj) 
only a square integrable gradient is required for the basis 
functions (see Eq. ©). 



1. Mesh generation 

The mesh is generated by merging structured atomic 
meshes to a molecular mesh. The nodes of atomic meshes 
consist of layers of vertices of polyhedra. The radius of 
the layer is changed as — q k ro with tq and q as 
parameters, and k C Z (— n < k < m; n,m C N). The 
choice of polyhedra is arbitrary, but they should provide 
tetrahedra of good quality (our quality requirements are 
explained below in this section). We have chosen to use 
deltoidal icositetrahedron and its dual, rhombicubocta- 
hedron, both shown in Fig. [21 





FIG. 2: Polyhedra used in atomic meshes of a) deltoidal 
icositetrahedron and b) rhombicuboctahedron 

The zeroth layer is chosen relative to the size of the 
highest occupied atomic orbital 7'o = (2/) _1 / 2 /4, where 
/ is the first ionization energy. The layers with negative 
indices are created until the radius of the layer is of the 
order of the lowest state rfc min < Z Q _1 /128. The factors 
j and are somewhat arbitrary at the moment, but 
are sufficient for systems under study. If necessary one 
extra layer is added, as the last layer should be deltoidal 
icositetrahedron to ensure good quality of the elements 
around the nuclei. The inner part of the mesh is finalized 
by adding one node to the nucleus R a . 

The nodes of the layers with positive indices are added 
only if the node is inside the atomic mesh region, i.e., 
not in the molecular mesh region. The node of atom a is 
in the molecular region if 



'Jab 



R 6 -R |/|r-RJ- 



R„ Rj, 



R„ 



r-BJ iRb-Ral 



</%-!) 
(24) 

for all other nuclei 6, where g ao = ?"o/( r 'o+ r o) are the rela- 
tive sizes with respect to the other nuclei, and (3 is chosen 



to be |. In practice, this procedure creates an empty 
space between atoms, which reaches closer to smaller 
atoms than larger ones, and its thickness is proportional 
to the distance between the closest pair of atoms. For 
each pair of atoms the atomic regions are inside two 
halves of an elliptical hypcrboloid. 

The nodes for the molecular mesh region are then cre- 
ated by first adding a spherical layer of nodes around 
the center of atomic charges R cc . The layer forms the 
boundary of the simulation cell and has a radius equal 
to rgfi — gmax; |rj — R cc |, the radius of the furthest 
node from the center of atomic charged multiplied by 
the layer ratio q. Then an initial molecular mesh is cre- 
ated by a Delaunay tetrahedralization^ of the nodes (see 
Fig. [3} . The molecular mesh is then refined by Delaunay 
refinement 3 -, i.e., by inserting nodes at the circumcen- 
ters (the center of circumsphere) of too large elements 
one at the time and repeating Delaunay tetrahedraliza- 
tion after each insertion. An clement is deemed too large, 
if its longest edge is longer than the longest edge of an 
element in the atomic mesh with the same distance from 
the closest atom. Or, if its average edge length is longer 
than the average edge length of an element in the atomic 
mesh with the same distance from the closest atom. (Ob- 
viously, the elements, which are connected to the nuclei, 
are ignored.) After refining the mesh to fill the size con- 
straints, the quality of the element s is ensure d. All ele- 
ments with a too small ratio s = y/3ri n /r C i rc , where ri n 
is the radius of the inscribing sphere, and r circ is the ra- 
dius of the circumsphere, are Delaunay refined as above 
until no elements with low quality are present. Keep- 
ing the ratio s relatively close to one will ensure that all 
angles (dihedral and face) are neither too large nor too 
small 1 !^ 3 ^. This is one of the standard measures for the 
quality of an element. The elements which are connected 
to the boundary nodes are not currently being refined. 
However, the quality of these elements is not very im- 
portant because the solution is practically zero in this 
region. 




FIG. 3: Initial molecular mesh for the CO molecule before 
refinement and improvement. The elements of the molecular 
region are shown in pink. 

The resulting molecular mesh is somewhat finer than 
the atomic meshes, but because the main interest is in 
the molecular region, we consider it justified to slightly 
overdiscretize this region. An example of a molecular 
mesh for benzene CqB-q with q = y/2, s = y/l/3, and 15 
outer layers is shown in Fig. [5] The diameter of mesh is 
55 A. 
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b) 
















FIG. 4: Cut plane of the molecular mesh of the CeH6 molecule 
with parameters q — \/2, s > \/l/3, and 15 outer layers 
(see text): a) the complete mesh (diameter of 55 A), b) the 
atomic mesh near a carbon nucleus, and c) the close-up of the 
molecular region. 



can be used. Above, 

n,- fe (r)=V*(r)V fe (r) (26) 
is the pair density. 

III. RESULTS AND DISCUSSION 

We demonstrate our ground state DFT and linear re- 
sponse TDDFT methods by applying them to atoms and 
small molecules. We calculated hydrogen, carbon, and 
oxygen atoms, and hydrogen, carbon monoxide, and ben- 
zene molecules. We calculated optical absorption spec- 
tra for a beryllium atom, sodium dimer, and benzene 
molecule. The convergence properties are discussed in 
both cases. 



D. Implementation 

Our current implementation is based on the ELMER 
finite element software package^, and the Delaunay 
tetrahedralization is done using TETGENi 2 .^ , The 
ground-state Kohn-Sham system was solved with the 
self-consistent iteration scheme. The locally optimal 
block preconditioned conjugated gradient (LOBPCG) 22 
method was applied to the linearized Kohn-Sham eigen- 
value problem (Eq. ([5])), and the convergence rate of the 
nonlinear system was enhanced with the Pulay mixing 35 
procedure for the density. The electronic charge was com- 
pensated by Gaussian counter-charges at nuclei in the 
Poisson equation (Eq. ©), and then a cancelling poten- 
tial for the counter-charges was added in the assembly 
of the Hamiltonian matrix (in Eq. (|2T|) '). Preconditioner 
for the eigenvalue problem was chosen to be the incom- 
plete Cholesky factorization 19 for T + aS, where T is the 
kinetic energy operator and a was chosen to be 13.6 eV. 

In the linear-response calculation, the main effort is in 
calculating the integrals of the matrix elements in equa- 
tion (|15[) . Each row of the matrix is independent of the 
other rows, and thus the problem is trivial to parallelize 
over the rows of the matrix. Also, some of the matrix 
elements (and rows) can be ignored beforehand as their 
eigenvalue difference is clearly outside the relevant energy 
interval, e.g., transitions from core states. The exchange- 
correlation kernel f X c(^, r', lu) requires the second func- 
tional derivative of the exchange-correlation functional 
with respect to the density. However, when the second 
derivative is not available, the finite-difference approxi- 
mation 

/ 3 SE XC . 
rr— — ; — -n,-t r ) = 
n{r)n{v') jhK ' 

r Vxc[n + An jk }(r) - v xc [n - An jk ](r) 



A. Ground state DFT 

We applied the local density approximation (LDA) 
functional with the Perdew-Wang parametrization^ 3 . in 
all calculations, and all results are for spin-compensated 
systems. In all calculations, the simulation cell diameter 
was approximately 50 A, and the geometrical coarsening 
factor q — \/2. 

The total energies of the atoms and molecules cal- 
culated with increasing polynomial degree are shown 
in Tables Q] and HH and the atomization energies of 
the molecules in Table IIIII We have used for H2 
and CO the bond lengths of 0.75A and l.lA, respec- 
tively. CgHg has a planar geometry with atomic po- 
sitions of C: (0.000, ±1.396)A, (±1.209, ±0.698)A, and 
H: (0.000, ±2.479)A, (±2.147, ±1.240)A used. The H 2 
mesh had 12xl0 3 , 41xl0 3 , and 96xl0 3 de grees of 
freedom (DOFs); the CO mesh had 14xl0 3 , 46xl0 3 , 
and 109 xlO 3 DOFs; and the C 6 H 6 mesh had 59xl0 3 , 
199xl0 3 , and 470xl0 3 (DOFs), for element degrees p = 
2, p = 3, and p = 4, respectively. The corresponding re- 
sults calculated with very high accuracy (~1 meV) using 
the electronic structure program FHI-aimsS are shown 
on the last rows of the tables. As one can see, the to- 
tal energy requires a high polynomial degree (p > 3) 
to converge within an error below 100 meV. However, in 
practice one is interested in the atomization energy of the 
system, which is the difference of the total energies be- 
tween the system and the corresponding isolated atoms. 
The cancellation of errors leads to a significant improve- 
ment in the accuracy, and already the 2 nd and 3 rd degree 
polynomials produce results with errors around 100 meV 
and 10 meV, respectively. The maximal cancellation was 
obtained by using the same mesh for isolated atoms as for 
the molecule, which can be considered as a kind of a basis 
set superposition error, {i.e., a counterpoise) correction^. 
The energies of the isolated atoms are lower in the molec- 
ular mesh than in the atomistic mesh. This is because 
the molecular mesh is denser than the atomistic mesh as 
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one wants to guarantee the good description of the bond- 
ing regions. The total and atomization energies are well 
converged with respect to the simulation cell diameter. 
We found less than one meV difference in range from 21 A 
to 15lA for the CO molecule. 

We performed nonrelativistic calculations for elements 
Zn, I, Hg, and At in order to test the quality of the dis- 
cretization in the case of heavy elements. We found that 
elements with d-electrons perform relatively well, e.g., 
the atomization energy of the I2 molecule (— 2.400eV, 
— 3.015eV, — 3.031eV for p = 2,3,4, respectively, and 
— 3.037eV for FHI-aims) has ~2-4 times larger errors 
than the CgHe molecule. Elements with f-electrons per- 
form much worse, e.g., At 2 has on order of magnitude 
larger errors than CgHg molecule. This is due to insuf- 
ficient angular degrees of freedom as the eigenvalues of 
the f-orbitals split (and d-orbitals split slightly) in en- 
ergy whereas p-orbitals do not. Our estimate is that one 
would need ^2-4 times more angular DOFs for heavy el- 
ements, which in addition to ~50% more radial DOFs is 
~3-6 times more DOFs than for carbon. 



TABLE I: Total energies of H, C, and O atoms calculated 
using elements with degrees p — 2 — 4. 

E LDA [eV] 
H C O 



TABLE III: Atomization energies of H 2 , CO and CeH 6 
molecules calculated using elements with degrees p — 2 — 4. 



p = 2 
P = 3 
p = 4 
FHI-aims 



-12.0509 
-12.1245 
-12.1271 
-12.127 



-1011.1067 
-1018.1042 
-1018.3581 
-1018.369 



-2011.1970 
-2025.8759 
-2026.4268 
-2026.451 







H 2 


AE L da [eV] 
CO 


CeH6 


p = 


2 


-6.6838 


-15.7573 


-81.0894 


p = 


3 


-6.6996 


-15.7162 


-80.8599 


p = 


4 


-6.6999 


-15.7114 


-80.8541 


FHI 


-aims 


-6.700 


-15.709 


-80.852 



In improperly generated meshes, this can cause severe 
problems as the potential energy surface may have sig- 
nificant artificial oscillations and discontinuities. For this 
reason, we recommend a slightly denser discretization 
of the bonding regions compared to the atomic regions. 
Based on our experimentations on diatomic molecules, 
this is sufficient and forces with a quality comparable to 
that from commonly used codes, such as the real-space 
code GPAW— , are obtained. 

Note, that we have given two different values for the 
atomization energy of CO at the bond length of Rco = 
1.1 A for each element degree p (see Tables rnll and \TV\ . 
Because the mesh generation is not unique for a given 
molecule but rather for given Cartesian positions and 
the order in which the atoms are given, the difference 
is due to different meshes obtained from two different 
generator inputs. However, the difference is one order of 
magnitude smaller than the error in the atomization en- 
ergy. The dipole moment shows errors less than 0.01 eA 
and 0.001 eA when using 2nd and 3rd order polynomials, 
respectively. 



TABLE II: Total energies of H 2 , CO and CeH 6 molecules 
calculated using elements with degrees p — 2 — 4. 

Elba [eV] 
H 2 CO CeHg 



p = 2 
P = 3 
p = 4 
FHI-aims 



-30.8407 
-30.9510 
-30.9542 
-30.954 



-3039.5322 
-3059.7776 
-3060.5009 
-3060.529 



-6226.5746 
-6262.5718 
-6263.7841 
-6263.829 



TABLE IV: Atomization energy of the CO molecule at dif- 
ferent bond lengths calculated using elements with degrees 
p = 2 - 4. 

AE L ba [eV] 

-Rco [A] p = 2 p = 3 p = 4 FHI-aims 



0.8 
1.0 
1.1 
1.2 
1.4 
1.8 
2.4 



-0.1272 
-14.4446 
-15.7584 
-15.6235 
-13.5165 
-8.5848 
-4.0303 



-0.6514 
-14.4495 
-15.7175 
-15.4910 
-13.3027 
-8.3963 
-3.9093 



-0.6648 
-14.4464 
-15.7115 
-15.4845 
-13.2934 
-8.3875 
-3.9043 



-0.660 
-14.444 
-15.709 
-15.482 
-13.292 
-8.386 
-3.903 



Tables IIVI and [V] show the convergence of the poten- 
tial energy surface and the dipole moment, respectively, 
calculated with elements with degrees p — 2 — 4. The po- 
tential energy surface shows no "egg-box effect" , known 
to exists in uniform real-space grids 30 . However, there 
exists a similar kind of effect. For example in a diatomic 
molecule, when the bond length is changed, new elements 
are created into or old ones are removed from the mesh. 



In Table IVI1 we show the Kohn-Sham eigenvalues of 
the CgHg molecule. The core eigenvalues exhibit much 
larger absolute errors than the valence eigenvalues, but 
the relative errors are of same order. The valence eigen- 
values converge similarly to the atomization energies, 
which is reasonable as the errors in the core eigenvalues 
cancel when taking the differences. The remaining er- 
ror is mainly due to the valence states and the molecular 
orbitals which they form. 
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TABLE V: Dipole moment of the CO molecule at different 
bond lengths calculated using elements with degrees p — 2 — 4. 

[Ilda [eA] 



r-i r X l 

Rco [A] 


P = 2 


p = 3 


p = 4 


FHI-aims 


0.8 


0.2454 


0.2402 


0.2400 


0.2398 


1.0 


0.1390 


0.1311 


0.1307 


0.1305 


1.1 


0.0745 


0.0669 


0.0666 


0.0663 


1.2 


0.0064 


-0.0010 


-0.0013 


-0.0015 


1.4 


-0.1330 


-0.1397 


-0.1399 


-0.1399 


1.8 


-0.3792 


-0.3792 


-0.3791 


-0.3790 


2.4 


-0.6084 


-0.5996 


-0.5992 


-0.5991 



TABLE VI: Kohn-Sham orbital energies (eigenvalues) of the 
CeHe molecule calculated using elements with degrees p = 
2-4. 







tLDA 


[eV] 




tate 


P = 2 


P = 3 


p = 4 


FHI-aims 


1 


-264.6616 


-266.3819 


-266.4388 


-266.4382 


6 


-264.6087 


-266.3585 


-266.4156 


-266.4150 


7 


-21.1552 


-21.1155 


-21.1557 


-21.1560 


8 


-18.3474 


-18.3608 


-18.3616 


-18.3619 


9 


-18.3404 


-18.3597 


-18.3609 


-18.3612 


18 


-8.2867 


-8.2915 


-8.2915 


-8.2917 


19 


-8.2839 


-8.2895 


-8.2895 


-8.2897 


20 


-6.5401 


-6.5341 


-6.5343 


-6.5338 


21 


-6.5385 


-6.5339 


-6.5342 


-6.5338 



confinement potentials can be seen in Figs. [6] and [7J A 
stronger confinement provides a faster convergence with 
respect to the number of states, but at the same time, 
the converged transition energies are shifted to slightly 
higher energies. A weaker confinement provides energies 
which are better converged, but the convergence may not 
be reached with the available number of states, as in 
the case of k c = lO" 4 ^/^ in Fig. [7J In Fig. the 
number of states was increased to 250 which yields an 
error less than 30 meVs. Obviously, the transitions at 
higher energies are more sensitive to confinement than 
transitions at low energies. The convergence with respect 
to the number of states included in the calculation is not 
smooth, but rather has a step every time a new state 
contributing to the transition is included in the basis. 
The step is not always smaller than the previous one, 
and it can be hard to decide whether the spectrum has 
converged by observing the convergence with respect to 
the number of states. 
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FIG. 5: Optical absorption spectra of the beryllium atom 
calculated using elements with degrees p — 2 (solid) and p — 3 
(dashed) . The inset shows a magnification of the high-energy 
region. 



B. Linear-response TDDFT 

For the linear-response TDDFT calculations we used 
actually a slightly different mesh generation scheme than 
that described above in Sec. Ill C 11 This old scheme, de- 
veloped also by us, uses i) different alternating polyhedra, 
i.e. tetrakis hexahedron and slightly compressed (larger 
cubic faces) truncated cuboctahedron, for atomic meshes, 
and ii) different quality measures, i.e. dihedral angles 
and aspect ratio (longest edge / smallest side height), 
than the current one. Compared to the old one, the cur- 
rent mesh generation scheme is simpler and it produces 
higher quality atomic meshes. However, the difference in 
quality is negligible when applying to the linear-response 
TDDFT. 

First, we consider a simple test system, a beryllium 
atom, to demonstrate the convergence properties. We 
begin with the polynomial degrees p — 2 and p = 3, 
150 states, the confinement radius R c = 8.0ao and the 
force constant k c = 10 -3 J5/,/oq. The resulting spectra 
are shown in Fig. [5l Increasing the polynomial degree 
of the elements has only a small effect of ~20 meV for 
the first peak position, and of ~70 meV for the second 
peak position {hv p —^ > hu p= 2). The effect of different 
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FIG. 6: Optical absorption spectrum of the beryllium atom 
calculated using the confinement potential parameters (from 
the highest curve to the lowest one): k c = W- 2 E h /al Rc = 
4.0a ; k c = W- 3 E h /a%, R c = 4.0a ; k c = 10' 4 E h /a 2 , R c = 
4.0a ; k c = 10~ 2 E h /a%, R c = 8.0a ; k c = W~ :i E h /al, R c = 
8.O00; and k c — 10 -4 Eh/a 2 ,, R c = 8.0ao. The spectra are 
separated by shifting the zero level. 
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FIG. 7: Convergence of the position of the first transition 
peak in the optical absorption spectrum of the beryllium 
atom with respect to the number of states included in the 
calculation. The confinement potential parameters used are: 
k c = lQ- 2 E h /a 2 Q , R c = 4.0a (dash-dotted); k c = W~ 3 E h /af h 
R c = 4.0a (dashed); and k c = 10~ 4 E h /a 2 l , R c = 4.0a 
(solid). 



Next, we examined two molecular test systems, the 
sodium dimer Na2 and the benzene molecule CgHg. 
The simulated photoabsorption spectrum of the Na2 is 
shown in Fig. [H The calculation included 250 states, 
and two different confinement potentials were used: one 
with R c = 8.0ao and k c = lO~ 2 Eh/ 'oq, and one with 
R c = 8.0ao and k c = 10 -3 Eh/ Oq. Practically, the same 
result of 2.15 eV was obtained for the first peak wit the 
two sets of parameters. For the second one there is a 
small shift from 2.69 eV to 2.72 eV. In contrast, the third 
clearly visible peak in the spectrum shows a remarkable 
shift from 3.4 eV to 4.3 eV. 
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FIG. 8: Optical absorption spectra of the sodium dimer cal- 
culated with the confinement potential parameters of k c — 
W~ 2 E h /al, R c = 8.0a (solid); and k c = 10" 3 E h /al, R c = 
8.0ao (dashed). 

The photoabsorption spectrum of the benzene 
molecule is shown in Fig. [9] Again two different con- 
finement potentials were used, one with R c = 4.0ao 
and k c = 10~ 2 Eh /a 2 ,, and one with R c — A.Oao and 
k c = lO~ 3 Eh/ciQ. The spectrum with the weaker confine- 
ment (k c — 10 -3 ) is not converged yet with 250 states, 
which corresponds already nearly 4 million matrix ele- 
ments. The spectrum with the stronger confinement and 
150 states is converged in the lower energy part of the 
spectrum, and reproduces correctly the main experimen- 
tal peak around 7 eV. It also shows the beginning of a 
broad feature above 9 eV in agreement with the experi- 
ment. 
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FIG. 9: Optical absorption spectra of the benzene molecule: 
a) the spectra calculated using the confinement potential pa- 
rameters k c = 10 -2 Eh/af), R c = 4.0ao (solid); and k c = 
10 -3 Eh /do, Rc = 4.0ao (dashed), b) the spectrum calculated 
using the confinement potential parameters k c = 10 -2 Eh/a.o, 
R c — 4.0ao (solid) and the experimental spectrum 23 (dashed). 



C. Computational details 

The ground state DFT calculations were performed as 
serial calculations, and the time consumed ranged from 
minutes (hydrogen atom) to tens of hours (benzene with 
p = 4). All calculations were done on 2.6 GHz AMD 
Opteron dual-core processors. As the systems were rel- 
atively small, the storage requirements of the matrices 
were much larger than those of the wavefunctions. The 
number of nonzero entries in the matrices ranged from 
1 x 10 5 (H, p = 2) to 4 x 10 7 (C 6 H 6 , p = 4). The num- 
ber of degrees of freedom ranged from 5000 (H, p = 2) 
to 5 x 10 5 (C 6 H 6 , p = 4). The linear response TDDFT 
was parallelized over the rows of the Casida matrix, and 
the absorption spectrum of benzene was calculated using 
several hundreds of processors. 

We consider the performance attained adequate for an 
initial "proof-of-concept" implementation. And, we ex- 
pect to increase the speed substantially by employing 
more sophisticated methods. Especially, the precondi- 
tioning of the eigenvalue problem and improved initial 
guesses for Kohn-Sham wavefunctions are expected to 
result in remarkable improvements. 



IV. CONCLUSIONS 

We have described and implemented a high-order hi- 
erarchical finite element method on unstructured meshes 
for all-electron DFT and TDDFT method. Our finite 
element mesh generation scheme assures the quality of 
the elements in the mesh by merging high-quality, struc- 
tured atomic meshes to an initial molecular mesh, which 
is then refined to meet the size and shape requirements by 
applying the Delaunay refinement method. The ground 
state DFT calculations were performed using elements 
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with degrees p = 2 — 4, which provide increasing levels of 
accuracy down to a few me Vs. 

We also described a flexible way to construct a basis 
for the finite-element linear response TDDFT calcula- 
tion. By applying an auxiliary confinement potential to 
the ground-state calculation, the basis can be tuned to 
balance between accuracy and computational cost. The 
convergence properties of the optical absorption spec- 
trum were discussed in the cases of the beryllium atom, 
and the sodium dimer and benzene molecules. 

The initial implementation has proved the applicabil- 
ity of the hierarchical finite element method on unstruc- 
tured meshes to all-electron DFT and TDDFT. However, 
there exist several open question, which must be further 
studied and improved, for example, the preconditioning 
of the eigenvalue problem. As the finite element method 
is well-suited for the domain decomposition, the paral- 
lel implementation would provide access to much larger 
systems within reasonable execution times. As most of 
the applications do not need full all-electron solutions, 
the PAW method or a similar treatment should speed up 



calculations remarkably in these cases. Magnetic fields, 
rclativistic effects, and quantum mechanical forces for 
atoms will be implemented in order to broaden the ap- 
plicability of the method. Finally, we believe that the 
most promising application areas for our method are be- 
yond the ground-state and linear response calculations, 
for example, in the time-propagation TDDFT scheme. 
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